Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: session setting to enforce queries access only home region rows #85704

Merged
merged 1 commit into from
Aug 26, 2022

Conversation

msirek
Copy link
Contributor

@msirek msirek commented Aug 7, 2022

Fixes #86228

This commit adds a new session setting, enforce_home_region, which
causes queries to error out if they need to talk to regions other than
the gateway region to answer the query.

A home region specifies the region(s) from which consistent reads from a
set of rows in a table can be served locally. The home region for a set
of rows in a multiregion table is determined differently depending on
the type of multiregion table involved:

Locality Home Region
REGIONAL BY ROW Home region determined by crdb_region column value
REGIONAL BY TABLE All rows share a single home region
GLOBAL Any region can act as the home region

When enforce_home_region is true, and a query has no home region
(for example, reading from different home regions in a REGIONAL BY ROW
table), error code 42899 (QueryHasNoHomeRegion) is returned.
When enforce_home_region is true, and a query's home region differs
from the gateway region, error code 42898
(QueryNotRunningInHomeRegion) is returned.
The error message, in some instances, provides a useful tip on possible
steps to take to allow the query to run entirely in the gateway region,
e.g.,

Query is not running in its home region. Try running the query from region 'ap-southeast-2'.

Query has no home region. Query has no home region.                                                                                
                          Try adding a filter on table.crdb_region and/or on key column (table.id)   
                          
Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability.

Query has no home region. The home region 'us-east-1' of table 'messages_rbt'                                                                                          
                          does not match the home region 'ap-southeast-2' of lookup table 'messages_rbr'.     

Support for this new session mode is being added in 3 phases.
This commit consists of phase 1, which include only simple static checks
during query compilation for the following allowed cases:

  • A scan of a table with LOCALITY REGIONAL BY TABLE with primary
    region matching the gateway region
  • A scan of a table with LOCALITY GLOBAL
  • A scan of a table with LOCALITY REGIONAL BY ROW using only local
    constraints (e.g. crdb_region = 'ca-central-1')
  • A scan of a table with LOCALITY REGIONAL BY ROW using
    locality-optimized search.
  • A locality-optimized lookup join with a LOCALITY REGIONAL BY ROW table.

Only tables in multiregion databases with ZONE survivability may be
scanned without error because with REGION survivability, ranges in a
down region may be served non-local to the gateway region, so are not
guaranteed to have low latency.

Note that locality-optimized search and locality-optimized join are not
guaranteed to scan no remote rows, but are still allowed.

Release note (sql change): A new session setting, enforce_home_region,
is added, which when true causes queries which have no home region or
which may scan rows via a database connection outside of the query's
home region to error out. Also, only tables in multiregion databases
with ZONE survivability may be scanned without error when this setting
is true because with REGION survivability, ranges in a down region may
be served non-local to the gateway region, so are not guaranteed to have
low latency.

Release justification: Low risk feature

@msirek msirek requested review from mgartner, rytaft, andy-kimball and a team August 7, 2022 00:57
@msirek msirek requested review from a team as code owners August 7, 2022 00:57
@msirek msirek requested a review from msbutler August 7, 2022 00:57
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There may be a better name for the session setting than error_on_remote_scan, such as error_on_remote_access. Suggestions welcome. If none, I may change it to error_on_remote_access.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @msbutler, and @rytaft)

@andy-kimball
Copy link
Contributor

There are two different kinds of errors we'll need to raise:

  1. Query does not have a home region.
  2. Query is not running in its home region.

#1 is a static error raised by the optimizer. #2 is a dynamic error raised by the execution engine.

I think we should be using "home region" language rather than "remote region" language. I think it's vital to explain the concept of a home region and use it in our docs, our errors, our blogs, etc.

So I'd be looking for a session setting name like check_home_region or enforce_home_region or similar. This is getting into PM territory, though, and I think needs to be discussed in a doc and/or at the regular MR meeting.

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, thanks. Phase 1 is only doing static checks, but does handle a few simple cases which could return Query is not running in its home region..

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @msbutler, and @rytaft)

@msirek msirek force-pushed the 83819 branch 2 times, most recently from 05637d9 to d9c764a Compare August 7, 2022 21:47
@msirek msirek changed the title opt: session setting to error out remote table reads opt: session setting to enforce queries access only home region rows Aug 7, 2022
Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the session flag name to enforce_home_region and the modified the PR description.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @msbutler, and @rytaft)

Copy link
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I agree that "home" may be the right terminology to use long-term, I worry about the impact that it will have on our docs short term. Even the introduction of the term here may require a fair bit of explanation.

What if we went with enforce_local_access instead? Local shouldn't need too much explanation, and it gets away from having to introduce a new concept just for this session variable?

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @msbutler, and @rytaft)

@andy-kimball
Copy link
Contributor

The term "home" is already a basic term in use in our docs. It's how we describe REGIONAL BY ROW, in fact (https://www.cockroachlabs.com/docs/v22.1/set-locality#crdb_region):

Every row in a regional by row table has a hidden `crdb_region` column that represents the row's home region.

To not use this terminology would be a departure from how we've been describing it in our docs.

Copy link
Collaborator

@ajstorm ajstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, perfect. I didn't realize we were using it so extensively. I'm good with the enforce_home_region naming.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @msbutler, and @rytaft)

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. This is looking great!

Reviewed 23 of 39 files at r1, 16 of 16 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andy-kimball, @mgartner, @msbutler, and @msirek)


-- commits line 10 at r2:
This is kind of misleading -- the PRIMARY region is a different concept from home region (although they are related...)


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 30 at r2 (raw file):

    message_id   UUID DEFAULT gen_random_uuid(),
    message    STRING NOT NULL,
    crdb_region crdb_internal_region NOT NULL,

what is the benefit of including this column here? Seems like this would add confusion


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 40 at r2 (raw file):

    message_id   UUID DEFAULT gen_random_uuid(),
    message    STRING NOT NULL,
    crdb_region crdb_internal_region NOT NULL,

ditto


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 139 at r2 (raw file):


# Non-locality-optimized lookup join should error out.
statement error pq: Query has no home region. Try removing the join expression.

"Try removing the join expression" seems like it would not be desirable for the user. Presumably they wanted to use a join. Maybe this should suggest something like "Try adding a filter on c_id and/or crdb_region"


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 238 at r2 (raw file):


# Select from REGIONAL BY TABLE should indicate the gateway region to use.
statement error pq: Query is not running in its home region. Try running the query from region 'ap-southeast-2'.

Shouldn't this error say "try running the query from region 'us-east-1'"?


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 332 at r2 (raw file):


# Statements which previously succeeded should now fail under REGION survivability.
statement error pq: Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability.

Technically the query still has a home region, so I think this error is misleading. I'd just say something like "enforce_home_region setting cannot be combined with REGION survivability"


pkg/sql/vars.go line 2123 at r2 (raw file):

		},
	},
	// CockroachDB extension.

nit: add blank line above


pkg/sql/opt/distribution/distribution.go line 68 at r2 (raw file):

// by a non-multiregion database or a multiregion database with SURVIVE REGION
// FAILURE goal.
func errorOnInvalidMultiregionDB(evalCtx *eval.Context, tabMeta *opt.TableMeta) {

Feels like this check should go in buildScan in the optbuilder.


pkg/sql/opt/memo/expr.go line 302 at r2 (raw file):

		}
	}
	return ""

would it be better to have two return params, with ok bool to indicate if it was possible to get a single region?


pkg/sql/opt/xform/coster.go line 162 at r2 (raw file):

	// which are not locality-optimized or Distribute operations when a session
	// mode is set to error out on remote node accesses.
	remoteCostPenalty = 1e8

Why don't we just use hugeCost? It's designed for cases like this.


pkg/sql/opt/xform/coster.go line 675 at r2 (raw file):

		// the cost of using other indexes to `hugeCost`, so let's make the
		// current plan's costs expensive, but not on par with the illegal index
		// plan since we never want the plan with the illegal index to win out.

This seems overly complicated for little benefit -- I'd stick with hugeCost


pkg/sql/opt/xform/optimizer.go line 792 at r2 (raw file):

				// setLowestCostTree does this walking already, so this check is
				// included here to avoid an extra walking step.
				if !join.LocalityOptimized {

Can this check be done in the execbuilder? That's where we typically put these kinds of checks.

Copy link
Contributor

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, and @msirek)


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 139 at r2 (raw file):


# Non-locality-optimized lookup join should error out.
statement error pq: Query has no home region. Try removing the join expression.

We should maybe file an issue or add a TODO somewhere to add hints to errors that point to docs with more information about what these errors mean.

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, and @rytaft)


-- commits line 10 at r2:

Previously, rytaft (Rebecca Taft) wrote…

This is kind of misleading -- the PRIMARY region is a different concept from home region (although they are related...)

Removed the reference to primary region:

A home region specifies the region(s) from which consistent reads from a
set of rows in a table can be served locally.

pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 30 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

what is the benefit of including this column here? Seems like this would add confusion

Removed the column

Code quote:

 crdb_region crdb_internal_region NOT NULL,

pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 40 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

ditto

Removed the column


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 139 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

"Try removing the join expression" seems like it would not be desirable for the user. Presumably they wanted to use a join. Maybe this should suggest something like "Try adding a filter on c_id and/or crdb_region"

OK. I guess we'd want to handle regional by row tables on either side of the lookup join. Updated the error in this case to:

Query has no home region. Try adding a filter on child.crdb_region and/or on key column (child.c_id). Try adding a filter on parent.crdb_region and/or on key column (parent.p_id).

pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 139 at r2 (raw file):

Previously, andy-kimball (Andy Kimball) wrote…

We should maybe file an issue or add a TODO somewhere to add hints to errors that point to docs with more information about what these errors mean.

OK. I've created #86178. Perhaps you could update the issue with more details, or your vision here, if there are specific things you'd like to see. Also, I'm not sure if we need to engage someone else to get new sections added to docs. Maybe @rytaft could help coordinate this.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 238 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Shouldn't this error say "try running the query from region 'us-east-1'"?

Fixed, thanks.

Code quote:

 Select from REGIONAL BY TABLE should indicate the gateway region to use.

pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 332 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Technically the query still has a home region, so I think this error is misleading. I'd just say something like "enforce_home_region setting cannot be combined with REGION survivability"

Fixed


pkg/sql/vars.go line 2123 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: add blank line above

Done

Code quote:

`enforce_home_region`: {

pkg/sql/opt/distribution/distribution.go line 68 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Feels like this check should go in buildScan in the optbuilder.

That's a possibility, but if we do the check before normalization, then if the scan is not actually used, like when we have an unsatisfiable filter:
EXPLAIN SELECT * FROM messages WHERE account_id = 1 AND account_id = 2
... we will end up unnecessarily erroring out. Thoughts?


pkg/sql/opt/memo/expr.go line 302 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

would it be better to have two return params, with ok bool to indicate if it was possible to get a single region?

Modified as suggested

Code quote:

func (e *DistributeExpr) GetRegionOfDistribution()

pkg/sql/opt/xform/coster.go line 162 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Why don't we just use hugeCost? It's designed for cases like this.

Returning hugeCost may cause the cost of a plan with a forced index (hinted index) to be more expensive than a plan without the forced index, since hinting uses this same mechanism. Then, when the optimizer sees an index picked which is not the hinted index, the query errors out, saying that an illegal index was chosen (I actually ran into this issue). So, if we want to use hugeCost here, we'd need a different mechanism to enforce selection of the hinted index.


pkg/sql/opt/xform/coster.go line 675 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This seems overly complicated for little benefit -- I'd stick with hugeCost

See comment above.


pkg/sql/opt/xform/optimizer.go line 792 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Can this check be done in the execbuilder? That's where we typically put these kinds of checks.

OK, done. This changes the behavior so we don't error out EXPLAIN. I was previously wondering if we wanted to allow EXPLAIN to run, even if the actual query errors out. It seems like we would, to allow the user to analyze the chosen query plan and maybe get a better sense of how to correct the query.

Code quote:

TODO(msirek): Remove this in phase 2 or 3 when we can dynamical

@rytaft
Copy link
Collaborator

rytaft commented Aug 22, 2022

pkg/sql/opt/exec/execbuilder/relational.go line 1912 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'm curious, though, why we should allow locality-optimized search to run, but not locality-optimized join.

Oh, that wasn't my intention. You could look at the locality optimized flag when determining the distribution.

Sorry -- I mean when determining the lookup distribution for the purpose of cost estimation.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, published my other changes too early -- just added a few additional comments

Reviewed 4 of 38 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, @msirek, and @taroface)


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 160 at r5 (raw file):


# Locality optimized lookup join or lookup join that might access multiple
# regions is not allowed in phase 1.

I think this should be allowed.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 492 at r5 (raw file):

SELECT * FROM non_multiregion_test_db.messages, multi_region_test_db.messages_global

# Scans from ZONE survivable tables with contradictions in predicates are not allowed.

do you mean non-multiregion?


pkg/sql/opt/cat/table.go line 168 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

why not just return crdb_region if it's not overridden?

(it's also fine to leave as-is if this is more effort for some reason)


pkg/sql/opt/distribution/distribution.go line 51 at r5 (raw file):

		// operation to be applied on the input to the lookup join instead of on
		// the lookup join itself.
		provided = BuildLookupJoinLookupTableDistribution(t.Memo().EvalContext(), t, t.Memo().RootIsExplain())

Sorry, I realize I might have led you astray before. This lookup join lookup table distribution is going to be great for costing purposes, but it should not be used to determine the output distribution of the lookup join. The output distribution of a lookup join should be the same as its input distribution (think about a DistSQL diagram with a join reader). So it's awesome that you wrote the BuildLookupJoinLookupTableDistribution function -- but I think it should only be called from the coster, not here.


pkg/sql/opt/distribution/distribution.go line 114 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Why are you returning in each case if Any() is true? Is this due to the "empty distribution" = all regions issue mentioned below?

This will probably change anyway based on my comment above...


pkg/sql/opt/exec/execbuilder/relational.go line 1847 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: false /* ... /, true / ... */

I meant nit: false /* ... */, true /* ... */


pkg/sql/opt/xform/coster.go line 948 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think you should probably put this back in -- for a lookup join, the output distribution should match the input distribution. But the cost of the lookup join needs to take into account the distribution of the lookup table. So if the lookups are not local (or not locality optimized), we'll need to add largeDistributeCost here.

(ditto for inverted join below)

@rytaft
Copy link
Collaborator

rytaft commented Aug 22, 2022

pkg/sql/opt/props/physical/distribution.go line 117 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

We don't make special exceptions for explain anywhere else. Why should we do that in this case?

e.g, if you try to force an index that cannot be used, I think we still error with EXPLAIN

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. No worries.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, @rytaft, and @taroface)


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 160 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think this should be allowed.

OK. I misinterpreted your earlier comment about not checking for locality-optimized join.


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 492 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

do you mean non-multiregion?

Fixed, thanks.


pkg/sql/opt/metadata.go line 571 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Why would we want to always qualify for an error message if there's no ambiguity? do we need this new param?

Just for ease of use. For example, if a query contains 20 tables, specifying the alias is the message points the user directly to the table in question instead of making them spend time examining all of their tables to find the one with the referenced columns.

Code quote:

If there's another column in the metadata with the same column alias but

pkg/sql/opt/cat/table.go line 168 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

(it's also fine to leave as-is if this is more effort for some reason)

No reason, just following what was present in the data structure. Modified it to return the home region column name in all cases. Now this method can also be used to detect if a table is REGIONAL BY ROW.

Code quote:

RegionalByRowAsColNameOverride

pkg/sql/opt/distribution/distribution.go line 51 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Sorry, I realize I might have led you astray before. This lookup join lookup table distribution is going to be great for costing purposes, but it should not be used to determine the output distribution of the lookup join. The output distribution of a lookup join should be the same as its input distribution (think about a DistSQL diagram with a join reader). So it's awesome that you wrote the BuildLookupJoinLookupTableDistribution function -- but I think it should only be called from the coster, not here.

No worries. Applied it only to costing now. I've added in code to set the distribution of locality optimized join to the gateway region to avoid distribute ops and extra added cost. Without this code we don't pick locality-optimized join sometimes. What do you think about this approach?


pkg/sql/opt/distribution/distribution.go line 114 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This will probably change anyway based on my comment above...

Yes, this is to try and fix the overloaded meaning of an empty distribution. I've added a TODO here and opened issue #86641.


pkg/sql/opt/distribution/distribution.go line 127 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

It would be great to eliminate this overloaded meaning of empty regions as soon as possible... maybe open an issue / leave a TODO?

Code is removed. Added a TODO here and opened issue #86641.


pkg/sql/opt/distribution/distribution.go line 135 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

What about a scan? Why not just make a separate case for ValuesExpr?

I was thinking the absence of a relational expression (for example, VALUES, scalar function call, etc.), could be executed locally, so that's the condition in the code I added instead of calling out all possible scalar expressions. Removed this; to be handled through #86641.


pkg/sql/opt/distribution/distribution.go line 171 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

is the first lookup expression always guaranteed to constrain the first index column?

It seems like it. addEqualityColumns appends to keyCols or lookupExpr in the same order as index columns are defined:

// All the lookup conditions must apply to the prefix of the index and so
// the projected columns created must be created in order.
for j := 0; j < numIndexKeyCols; j++ {
idxCol := b.table.IndexColumnID(index, j)
idxColIsDesc := index.Column(j).Descending
if eqIdx, ok := rightEq.Find(idxCol); ok {
addEqualityColumns(leftEq[eqIdx], idxCol)

... and this is later copied into LookupJoinPrivate.LookupExpr.

Code quote:

firstIndexColEqExpr := lookupJoin.LookupJoinPrivate.LookupExpr[0].Condition

pkg/sql/opt/exec/execbuilder/relational.go line 1912 at r3 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Sorry -- I mean when determining the lookup distribution for the purpose of cost estimation.

No worries. Modified the code to use the lookup distribution for lookup join costing. Other joins will incur the largeDistributeCost from the distribute operation.


pkg/sql/opt/exec/execbuilder/relational.go line 1847 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I meant nit: false /* ... */, true /* ... */

Fixed

Code quote:

colName := b.mem.Metadata().QualifiedAlias(colID, false, true, b.catalog)

pkg/sql/opt/exec/execbuilder/relational.go line 1992 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

should this be an errors.AssertionFailedf?

Fixed


pkg/sql/opt/memo/expr_format.go line 1427 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: false /* ... */

Fixed


pkg/sql/opt/memo/memo.go line 508 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Seems like a bug if we have a distribute op above EXPLAIN...

I think that might have been a result of the code added to modify the distribution of lookup join. Skipping over DistributeOp is no longer needed and is removed.


pkg/sql/opt/props/physical/distribution.go line 117 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

e.g, if you try to force an index that cannot be used, I think we still error with EXPLAIN

When error checking for lookup join was moved to the execbuilder, it caused EXPLAIN statements to not error out. To maintain consistency with those errors (and also to allow EXPLAIN to be run to help the user see the source of the locality issues), all erroring is now avoided for EXPLAIN.

Code quote (from pkg/sql/opt/memo/expr_format.go):

md.QualifiedAlias(

pkg/sql/opt/props/physical/distribution.go line 198 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

PRIMARY region is different from the region of REGIONAL BY TABLE -- you can have a table with LOCALITY REGIONAL BY TABLE IN PRIMARY REGION or LOCALITY REGIONAL BY TABLE IN <some other region>

OK, I guess that terminology doesn't apply to tables (e.g. the REGIONAL BY TABLE primary region), just databases. Updated the comment to use "home region".

Code quote:

Use the leaseholder region(s), which should be the same as the PRIMARY

pkg/sql/opt/xform/coster.go line 948 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

(ditto for inverted join below)

Re-added it. We can't know at this point what the distribution of the Input is, so adding a large cost if not locality optimized might miss some local regular lookup joins. I think the distribute op and added costs would take care of the locality optimized case. I don't follow the comment about inverted join. Details?


pkg/sql/opt/xform/testdata/rules/join line 10739 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This distribute operator shouldn't be here.

Fixed

Code quote:

SELECT * FROM def_part WHERE NOT EXISTS (SELECT * FROM abc_part WHERE e = a) AND d = 1

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 15 of 15 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, @msirek, and @taroface)


pkg/sql/opt/cat/table.go line 169 at r6 (raw file):

	// REGIONAL BY ROW TABLE, otherwise "", false is returned.
	// This column is name `crdb_region` by default, but may be overridden with a
	// different name in a `REGIONAL BY ROW AS` DDL clause.

Do you have any tests with REGIONAL BY ROW AS?


pkg/sql/opt/distribution/distribution.go line 51 at r5 (raw file):

Previously, msirek (Mark Sirek) wrote…

No worries. Applied it only to costing now. I've added in code to set the distribution of locality optimized join to the gateway region to avoid distribute ops and extra added cost. Without this code we don't pick locality-optimized join sometimes. What do you think about this approach?

Seems fine for now, since I think this is the same assumption that the exploration rules generating locality optimized lookup joins make. We can revisit later if need be.


pkg/sql/opt/distribution/distribution.go line 171 at r5 (raw file):

Previously, msirek (Mark Sirek) wrote…

It seems like it. addEqualityColumns appends to keyCols or lookupExpr in the same order as index columns are defined:

// All the lookup conditions must apply to the prefix of the index and so
// the projected columns created must be created in order.
for j := 0; j < numIndexKeyCols; j++ {
idxCol := b.table.IndexColumnID(index, j)
idxColIsDesc := index.Column(j).Descending
if eqIdx, ok := rightEq.Find(idxCol); ok {
addEqualityColumns(leftEq[eqIdx], idxCol)

... and this is later copied into LookupJoinPrivate.LookupExpr.

Is there a way we can ensure that doesn't change? Maybe add a comment in addEqualityColumns and/or add a test case that will fail if it changes?


pkg/sql/opt/distribution/distribution.go line 120 at r6 (raw file):

		if localityOptimizedJoin {
			provided.FromLocality(evalCtx.Locality)
		}

nit: You could just move this up to where you check for a LookupJoinExpr and break from the switch


pkg/sql/opt/exec/execbuilder/relational.go line 763 at r6 (raw file):

	if b.evalCtx.SessionData().EnforceHomeRegion && !b.mem.RootIsExplain() && !b.skipScanExprCollection {
		if b.builtScans == nil {
			b.builtScans = make([]*memo.ScanExpr, 0, 2)

Why cap=2?


pkg/sql/opt/exec/execbuilder/relational.go line 1750 at r6 (raw file):

		return errors.AssertionFailedf("The gateway region could not be determined while enforcing query home region.")
	}
	for i, scan := range b.builtScans {

Instead of this mechanism, it seems like it might be better to take advantage of the fact that you've set expressions to have hugeCost in the coster, and only call this function if you find that a DistributeExpr has hugeCost. At that point, you know you'll be throwing an error anyway, so it's ok to traverse the expression tree again to pick out the scans. I'd rather not add more unnecessary bookkeeping to the happy path.


pkg/sql/opt/exec/execbuilder/relational.go line 2076 at r6 (raw file):

		if rel.Op() == opt.ScanOp {
			b.skipScanExprCollection = true
		}

I think you can remove this code if you take my suggestion above


pkg/sql/opt/xform/coster.go line 948 at r4 (raw file):

Previously, msirek (Mark Sirek) wrote…

Re-added it. We can't know at this point what the distribution of the Input is, so adding a large cost if not locality optimized might miss some local regular lookup joins. I think the distribute op and added costs would take care of the locality optimized case. I don't follow the comment about inverted join. Details?

Inverted join is similar to lookup join in that it performs lookups into an index using the kv api (the index just happens to be inverted)

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, @rytaft, and @taroface)


pkg/sql/opt/cat/table.go line 169 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Do you have any tests with REGIONAL BY ROW AS?

Yes, here:
https://github.com/msirek/cockroach/blob/d88c22003dac4df02eb11b3e6907a8d2070d97ef/pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error#L377-L381


pkg/sql/opt/distribution/distribution.go line 51 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Seems fine for now, since I think this is the same assumption that the exploration rules generating locality optimized lookup joins make. We can revisit later if need be.

OK, thanks.


pkg/sql/opt/distribution/distribution.go line 171 at r5 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Is there a way we can ensure that doesn't change? Maybe add a comment in addEqualityColumns and/or add a test case that will fail if it changes?

I found method getConstPrefixFilter which ensures that the filter is on the first index column. I moved this to be a method of LookupJoinPrivate and am using it now.

Code quote:

lookupJoin.LookupJoinPrivate.LookupExpr

pkg/sql/opt/distribution/distribution.go line 120 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: You could just move this up to where you check for a LookupJoinExpr and break from the switch

done

Code quote:

// Locality-optimized join is considered to be local.

pkg/sql/opt/exec/execbuilder/relational.go line 763 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Why cap=2?

Large enough to handle simple 2-table join queries without wasting memory.


pkg/sql/opt/exec/execbuilder/relational.go line 1750 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Instead of this mechanism, it seems like it might be better to take advantage of the fact that you've set expressions to have hugeCost in the coster, and only call this function if you find that a DistributeExpr has hugeCost. At that point, you know you'll be throwing an error anyway, so it's ok to traverse the expression tree again to pick out the scans. I'd rather not add more unnecessary bookkeeping to the happy path.

OK, changed skipScanExprCollection to doScanExprCollection so by default we don't collect the Scans. If the Distribute hashugeCost, we set doScanExprCollectionto true and traverse the tree again to populate thebuiltScansslice before callingenforceScanWithHomeRegion. BTW, I added code to ensure a DistributeExpris never built with anExplainExpr` as its input.


pkg/sql/opt/exec/execbuilder/relational.go line 2076 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think you can remove this code if you take my suggestion above

We still need this code because the lookup join may be locality optimized, in which case the Scan needs to be ignored.

Code quote:

enforceScanWithHomeRegion

pkg/sql/opt/xform/coster.go line 948 at r4 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Inverted join is similar to lookup join in that it performs lookups into an index using the kv api (the index just happens to be inverted)

Thanks, added handling for inverted join, plus inverted join as input to a lookup join.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 9 files at r7, 10 of 10 files at r8, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, @msirek, and @taroface)


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 599 at r8 (raw file):


# Inverted join doing lookup into a REGIONAL BY ROW table is not allowed.
statement error pq:

nit: this is a regex so you should be able to include some part of the error message


pkg/sql/opt/distribution/distribution.go line 149 at r8 (raw file):

	} else if lookupTable.IsRegionalByRow() {
		if len(lookupJoin.KeyCols) > 0 {
			projectExpr := lookupJoin.Input

nit: rename variable projectExpr since it's no longer necessarily a project


pkg/sql/opt/distribution/distribution.go line 152 at r8 (raw file):

			firstKeyColID := lookupJoin.LookupJoinPrivate.KeyCols[0]
			if projectExpr.Op() == opt.InvertedJoinOp {
				invertedJoinExpr, _ := projectExpr.(*memo.InvertedJoinExpr)

nit: instead of checking the op, you can just do

if invertedJoinExpr, ok := projectExpr.(*memo.InvertedJoinExpr); ok {

here and below with constExpr


pkg/sql/opt/distribution/distribution.go line 163 at r8 (raw file):

					}
				}
			} else if projection, ok := projectExpr.(*memo.ProjectExpr); ok {

nit: rename projection to projectExpr


pkg/sql/opt/distribution/distribution.go line 195 at r8 (raw file):

// statically determined.
func BuildInvertedJoinLookupTableDistribution(
	evalCtx *eval.Context, invertedJoin *memo.InvertedJoinExpr, forExplain bool,

I'm still pretty negative about this special forExplain param you've added everywhere and the checks for RootIsExplain. What happens if you remove them all?


pkg/sql/opt/exec/execbuilder/relational.go line 763 at r6 (raw file):

Previously, msirek (Mark Sirek) wrote…

Large enough to handle simple 2-table join queries without wasting memory.

Consider adding a comment to that effect


pkg/sql/opt/exec/execbuilder/relational.go line 2076 at r6 (raw file):

Previously, msirek (Mark Sirek) wrote…

We still need this code because the lookup join may be locality optimized, in which case the Scan needs to be ignored.

If the lookup join is locality optimized, the input should also have a local distribution. If that's not the case today, maybe add a TODO to get rid of this.


pkg/sql/opt/exec/execbuilder/relational.go line 1805 at r8 (raw file):

	if b.evalCtx.SessionData().EnforceHomeRegion && !b.mem.RootIsExplain() {
		if distribute.Cost() >= xform.LargeDistributeCost {

This cost check isn't needed since EnforceHomeRegion automatically means that any no-op distribution will cause an error.


pkg/sql/opt/exec/execbuilder/relational.go line 2038 at r8 (raw file):

						}
					}
				}

Looks like all this code is basically duplicated from distribution.go. Is there any way to refactor a bit to avoid the duplication? (also, I have all the same nits here as I had in that file)


pkg/sql/opt/exec/execbuilder/relational.go line 2093 at r8 (raw file):

		}
	}
	saveDoScanExprCollection := b.doScanExprCollection

this should be inside the enforceHomeRegion block, since it's inside the block below.


pkg/sql/opt/exec/execbuilder/relational.go line 2114 at r8 (raw file):

		//               determine if the lookup will be local, or after #69617 is
		//               merged.
		err = b.handleRemoteLookupJoinError(join)

I think you should check for LargeDistributeCost before calling this.


pkg/sql/opt/exec/execbuilder/relational.go line 2312 at r8 (raw file):

func (b *Builder) buildInvertedJoin(join *memo.InvertedJoinExpr) (execPlan, error) {
	enforceHomeRegion := b.evalCtx.SessionData().EnforceHomeRegion && !b.mem.RootIsExplain()
	saveDoScanExprCollection := b.doScanExprCollection

this should be inside the enforceHomeRegion block, since it's inside the block below.


pkg/sql/opt/exec/execbuilder/relational.go line 2320 at r8 (raw file):

		}
		if rel.Op() == opt.ScanOp {
			b.doScanExprCollection = true

Why are you setting this to true if we're not in an error case?

Also, we don't support locality optimized inverted joins, so I think you can just remove this.


pkg/sql/opt/exec/execbuilder/relational.go line 2333 at r8 (raw file):

		//               determine if the lookup will be local, or after #69617 is
		//               merged.
		err = b.handleRemoteInvertedJoinError(join)

I think you should check for LargeDistributeCost before calling this.


pkg/sql/opt/memo/expr.go line 812 at r8 (raw file):

		if filter.Condition.Op() == opt.EqOp {
			leftChild := filter.Condition.Child(0)
			if leftChild.Op() == opt.VariableOp {

nit: here you can do:
if variableExpr, ok := leftChild.(*VariableExpr); ok {


pkg/sql/opt/memo/expr.go line 820 at r8 (raw file):

		}
	}
	return nil /* expr */, false /* ok */

nit: we don't usually include comments like this for return params


pkg/sql/opt/memo/expr.go line 834 at r8 (raw file):

// constant values. If such a filter is found, GetConstPrefixFilter returns the
// position of the filter and ok=true. Otherwise, returns ok=false.
func (lj *LookupJoinPrivate) GetConstPrefixFilter(memo *Memo) (pos int, ok bool) {

nit: doesn't seem like you need the whole memo -- why not just pass what you need? Maybe just metadata?

Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, @rytaft, and @taroface)


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 599 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: this is a regex so you should be able to include some part of the error message

Fixed


pkg/sql/opt/distribution/distribution.go line 149 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: rename variable projectExpr since it's no longer necessarily a project

Done

Code quote:

projectExpr := lookupJoin.Input

pkg/sql/opt/distribution/distribution.go line 152 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: instead of checking the op, you can just do

if invertedJoinExpr, ok := projectExpr.(*memo.InvertedJoinExpr); ok {

here and below with constExpr

Done

Code quote:

== opt.InvertedJoinOp

pkg/sql/opt/distribution/distribution.go line 163 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: rename projection to projectExpr

Done

Code quote:

projection, ok := projectExpr.(*memo.ProjectExpr); ok

pkg/sql/opt/distribution/distribution.go line 195 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I'm still pretty negative about this special forExplain param you've added everywhere and the checks for RootIsExplain. What happens if you remove them all?

Removed them. Since most checks are in the execbuilder there may not be a large impact, but possibly there will be some cases where we'd be unable to get an explain with the session flag on.


pkg/sql/opt/exec/execbuilder/relational.go line 763 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Consider adding a comment to that effect

Done


pkg/sql/opt/exec/execbuilder/relational.go line 2076 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

If the lookup join is locality optimized, the input should also have a local distribution. If that's not the case today, maybe add a TODO to get rid of this.

One problem case is when we have a lookup join with each side of the join in a different region, we want to make sure the Query has no home region error is returned. Removing this code causes the error Query is not running in its home region message to be returned because the lookup table is not represented as a Scan. Also, if by collecting all of the Scans, even those under locality-optimized joins, an RBR table input to a locality optimized join might not have a single home region, so we don't want that Scan to be processed and cause the query to error out.


pkg/sql/opt/exec/execbuilder/relational.go line 1805 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

This cost check isn't needed since EnforceHomeRegion automatically means that any no-op distribution will cause an error.

Removed

Code quote:

if distribute.Cost() >= xform.LargeDistributeCost {

pkg/sql/opt/exec/execbuilder/relational.go line 2038 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Looks like all this code is basically duplicated from distribution.go. Is there any way to refactor a bit to avoid the duplication? (also, I have all the same nits here as I had in that file)

Created function GetDEnumAsStringFromConstantExpr for this.

Code quote:

saveDoScanExprCollection := b.doScanExprCollection

pkg/sql/opt/exec/execbuilder/relational.go line 2093 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

this should be inside the enforceHomeRegion block, since it's inside the block below.

Moved


pkg/sql/opt/exec/execbuilder/relational.go line 2114 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think you should check for LargeDistributeCost before calling this.

That could only be done if the distribution of the Input to the lookup join or inverted join were known at costing time. Since it's not, we might not add the LargeDistributeCost in the coster, but by the time the distributions are all filled in, we have a join in hand which needs to error out.

Code quote:

TODO(msirek): Update this in phase 

pkg/sql/opt/exec/execbuilder/relational.go line 2312 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

this should be inside the enforceHomeRegion block, since it's inside the block below.

Moved it


pkg/sql/opt/exec/execbuilder/relational.go line 2320 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Why are you setting this to true if we're not in an error case?

Also, we don't support locality optimized inverted joins, so I think you can just remove this.

Corrected this to set it to false. It's actually an issue with returning the correct error message on mismatched left and right inputs (see previous comment).


pkg/sql/opt/exec/execbuilder/relational.go line 2333 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

I think you should check for LargeDistributeCost before calling this.

See previous comment. The coster is not aware of the Input distribution. Also, the Distribute operation may be the one recording the LargeDistributeCost, and it may be sitting above the join and not be accessible here if the Input itself it local, but the join as a whole, is not.

Code quote:

b.handleRemoteInvertedJoinError(join)

pkg/sql/opt/memo/expr.go line 812 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: here you can do:
if variableExpr, ok := leftChild.(*VariableExpr); ok {

Done

Code quote:

func (ij *InvertedJoinPrivate) GetConstExprFromFilter(colID opt.ColumnID) (expr opt.Expr, ok bool) {

pkg/sql/opt/memo/expr.go line 820 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: we don't usually include comments like this for return params

Removed


pkg/sql/opt/memo/expr.go line 834 at r8 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: doesn't seem like you need the whole memo -- why not just pass what you need? Maybe just metadata?

Done

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: thanks for working through all these details

Reviewed 19 of 19 files at r9, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner, @msbutler, @msirek, and @taroface)


pkg/sql/opt/distribution/distribution.go line 195 at r8 (raw file):

Previously, msirek (Mark Sirek) wrote…

Removed them. Since most checks are in the execbuilder there may not be a large impact, but possibly there will be some cases where we'd be unable to get an explain with the session flag on.

Thank you!


pkg/sql/opt/exec/execbuilder/relational.go line 2012 at r9 (raw file):

		if lookupTable.IsRegionalByRow() {
			if len(join.KeyCols) > 0 {
				projectExpr := join.Input

nit: projectExpr -> inputExpr


pkg/sql/opt/exec/execbuilder/relational.go line 2014 at r9 (raw file):

				projectExpr := join.Input
				firstKeyColID := join.KeyCols[0]
				if projectExpr.Op() == opt.InvertedJoinOp {

nit: if invertedJoinExpr, ok := project.(*memo.InvertedJoinExpr); ok {


pkg/sql/opt/exec/execbuilder/relational.go line 2021 at r9 (raw file):

						}
					}
				} else if projection, ok := projectExpr.(*memo.ProjectExpr); ok {

nit: projection -> projectExpr

@msirek msirek force-pushed the 83819 branch 2 times, most recently from a36d187 to 951604a Compare August 25, 2022 22:43
Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @msbutler, @rytaft, and @taroface)


pkg/sql/opt/exec/execbuilder/relational.go line 2012 at r9 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: projectExpr -> inputExpr

Done

Code quote:

projectExpr := join.Input

pkg/sql/opt/exec/execbuilder/relational.go line 2014 at r9 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: if invertedJoinExpr, ok := project.(*memo.InvertedJoinExpr); ok {

Done


pkg/sql/opt/exec/execbuilder/relational.go line 2021 at r9 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

nit: projection -> projectExpr

Done

Code quote:

projection, ok := projectExpr

@msirek msirek force-pushed the 83819 branch 4 times, most recently from e4dcf6e to 839a8ce Compare August 26, 2022 00:43
Fixes cockroachdb#86228

This commit adds a new session setting, `enforce_home_region`, which
causes queries to error out if they need to talk to regions other than
the gateway region to answer the query.

A home region specifies the region(s) from which consistent reads from a
set of rows in a table can be served locally. The home region for a set
of rows in a multiregion table is determined differently depending on
the type of multiregion table involved:
| Locality | Home Region |
| -------- | ----------- |
| REGIONAL BY ROW | Home region determined by crdb_region column value |
| REGIONAL BY TABLE | All rows share a single home region |
| GLOBAL | Any region can act as the home region |

When `enforce_home_region` is true, and a query has no home region
(for example, reading from different home regions in a REGIONAL BY ROW
table), error code 42899 (`QueryHasNoHomeRegion`) is returned.
When `enforce_home_region` is true, and a query's home region differs
from the gateway region, error code 42898
(`QueryNotRunningInHomeRegion`) is returned.
The error message, in some instances, provides a useful tip on possible
steps to take to allow the query to run entirely in the gateway region,
e.g.,
```
Query is not running in its home region. Try running the query from region 'ap-southeast-2'.

Query has no home region. Query has no home region.
                          Try adding a filter on table.crdb_region and/or on key column (table.id)

Query has no home region. Try accessing only tables in multi-region databases with ZONE survivability.

Query has no home region. The home region 'us-east-1' of table 'messages_rbt'
                          does not match the home region 'ap-southeast-2' of lookup table 'messages_rbr'.
```

Support for this new session mode is being added in 3 phases.
This commit consists of phase 1, which include only simple static checks
during query compilation for the following allowed cases:
- A scan of a table with `LOCALITY REGIONAL BY TABLE` with primary
region matching the gateway region
- A scan of a table with `LOCALITY GLOBAL`
- A scan of a table with `LOCALITY REGIONAL BY ROW` using only local
constraints (e.g. crdb_region = 'ca-central-1')
- A scan of a table with `LOCALITY REGIONAL BY ROW` using
locality-optimized search.

Only tables in multiregion databases with ZONE survivability may be
scanned without error because with REGION survivability, ranges in a
down region may be served non-local to the gateway region, so are not
guaranteed to have low latency.

Note that locality-optimized search is not guaranteed to scan no remote
rows, but is still allowed. Locality-optimized join is disallowed when
`enforce_home_region` is true.

Release note (sql change): A new session setting, enforce_home_region,
is added, which when true causes queries which have no home region or
which may scan rows via a database connection outside of the query's
home region to error out. Also, only tables in multiregion databases
with ZONE survivability may be scanned without error when this setting
is true because with REGION survivability, ranges in a down region may
be served non-local to the gateway region, so are not guaranteed to have
low latency.

Release justification: Low risk feature
Copy link
Contributor Author

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!
bors r+

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner, @msbutler, @rytaft, and @taroface)

@craig
Copy link
Contributor

craig bot commented Aug 26, 2022

Build succeeded:

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 5 files at r10, 12 of 12 files at r11, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/ccl/logictestccl/testdata/logic_test/multi_region_remote_access_error line 173 at r11 (raw file):


# Zone configs sometimes are not available right away. Issue a couple queries
# that take a while to compile to make sure they're available.

FYI, there is a sleep directive in logictests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

opt: support option to error if query cannot run in a single region
5 participants